A Penetration Method for UAV Based on Distributed Reinforcement Learning and Demonstrations
Abstract
The penetration of unmanned aerial vehicles (UAVs) is an essential link in modern warfare. Enhancing the autonomous penetration ability of UAVs through machine learning has become a research hotspot. However, the current generation of autonomous penetration strategies for UAVs faces the problem of excessive sample demand. To reduce the sample demand, this paper proposes a combination policy learning (CPL) algorithm that combines distributed reinforcement learning and demonstrations. Innovatively, the action of the CPL algorithm is jointly determined by the initial policy obtained from demonstrations and the target policy in the asynchronous advantage actor-critic network, thus retaining the guiding role of demonstrations in the initial training. In a complex and unknown dynamic environment, 1000 training experiments and 500 test experiments were conducted for the CPL algorithm and related baseline algorithms. The results show that the CPL algorithm has the smallest sample demand, the highest convergence efficiency, and the highest penetration success rate among all the algorithms, and that it has strong robustness in dynamic environments.
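The abstract says the action is jointly determined by a demonstration-derived initial policy and the actor-critic target policy, but it does not give the combination rule. The following is only an illustrative sketch, assuming a discrete action space, a convex mixture of the two action distributions, and a linearly decaying demonstration weight. The names `combine_policies`, `demo_weight`, `sample_action`, `beta`, and `decay_episodes` are hypothetical and do not come from the paper.

```python
import random


def combine_policies(demo_probs, target_probs, beta):
    """Mix two discrete action distributions.

    ASSUMPTION: a convex mixture with weight `beta` on the demonstration
    policy; the paper's actual combination rule is not stated in the abstract.
    """
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must lie in [0, 1]")
    if len(demo_probs) != len(target_probs):
        raise ValueError("both policies must cover the same action set")
    mixed = [beta * d + (1.0 - beta) * t
             for d, t in zip(demo_probs, target_probs)]
    total = sum(mixed)
    # Renormalise to guard against small floating-point drift.
    return [p / total for p in mixed]


def demo_weight(episode, decay_episodes):
    """Hypothetical schedule: weight falls linearly from 1 to 0."""
    return max(0.0, 1.0 - episode / decay_episodes)


def sample_action(probs, rng):
    """Draw one action index according to the given probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    demo = [0.7, 0.2, 0.1]    # placeholder output of a demonstration policy
    target = [0.2, 0.3, 0.5]  # placeholder output of an actor network
    for episode in (0, 50, 100):
        beta = demo_weight(episode, decay_episodes=100)
        mixed = combine_policies(demo, target, beta)
        print(episode, beta, mixed, sample_action(mixed, rng))
```

Under these assumptions the demonstration policy dominates early episodes and the learned policy takes over as `beta` decays, which is one plausible reading of "retaining the guiding role of demonstrations in the initial training".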
Related resources
Learning and Control of UAV Maneuvers Based on Demonstrations
Many maneuvers of Unmanned Aerial Vehicles (UAV) can be considered within a framework of trajectory following. Though this issue can differ from one application to another, they all share the same problem of finding an optimal path (or signal) to perform the specified task. Finding this optimal trajectory is a challenging issue since it depends on both having an accurate mathematical model of t...
Improving Reinforcement Learning with Confidence-Based Demonstrations
Reinforcement learning has had many successes, but in practice it often requires significant amounts of data to learn high-performing policies. One common way to improve learning is to allow a trained (source) agent to assist a new (target) agent. The goals in this setting are to 1) improve the target agent’s performance, relative to learning unaided, and 2) allow the target agent to outperform...
Reinforcement Learning with Multiple Demonstrations
Many tasks in robotics can be described as a trajectory that the robot should follow. Unfortunately, specifying the desired trajectory is often a non-trivial task. For example, when asked to describe the trajectory that a helicopter should follow to perform an aerobatic flip, one would have to not only (a) specify a complete trajectory in state space that intuitively corresponds to the aerobati...
Reinforcement Learning from Imperfect Demonstrations
Robust real-world learning should benefit from both demonstrations and interaction with the environment. Current approaches to learning from demonstration and reward perform supervised learning on expert demonstration data and use reinforcement learning to further improve performance based on reward from the environment. These tasks have divergent losses which are difficult to jointly optimize;...
Dynamic Obstacle Avoidance by Distributed Algorithm based on Reinforcement Learning (RESEARCH NOTE)
In this paper we focus on the application of reinforcement learning to obstacle avoidance in dynamic environments in wireless sensor networks. A distributed algorithm based on reinforcement learning is developed for sensor networks to guide a mobile robot through dynamic obstacles. The sensor network models the danger of the area under coverage as obstacles, and has the property of adoption o...
Journal
Journal title: Drones
Year: 2023
ISSN: 2504-446X
DOI: https://doi.org/10.3390/drones7040232